Add Windows Ninja Multi-Config build with sccache caching by bernardladenthin · Pull Request #248 · bernardladenthin/java-llama.cpp

bernardladenthin · 2026-06-20T17:37:04Z

Summary

Ship Windows natives built with Ninja Multi-Config + sccache/Depot as a parallel ninja-windows classifier JAR, alongside the existing MSVC/Visual Studio build (which remains the default).
Add two new CI jobs (build-windows-x86_64-ninja, build-windows-x86-ninja) that build with Ninja, cache via sccache + Depot WebDAV, and run C++ unit tests.
Add test-java-windows-x86_64-ninja to validate the Ninja-built DLL end-to-end via JNI.
Implement sccache probe guard in .github/build.bat (mirrors build.sh) to safely enable the compiler launcher only when sccache is present and working.
Add windows-ninja Maven profile to produce the classifier JAR from a separate resource tree (src/main/resources_windows_ninja/).
Update package, publish-snapshot, and publish-release jobs to download and activate the Ninja build.

Rationale

The Visual Studio generator ignores CMAKE_{C,CXX}_COMPILER_LAUNCHER, so the MSVC Windows jobs cannot use sccache caching. Rather than switch the trusted MSVC build, this change builds the same CPU natives a second time with Ninja Multi-Config (which does honor the launcher) and ships them as an optional classifier JAR. The MSVC build is kept as the permanent default; the Ninja artifact is an additional, cache-accelerated, independently validated option for users to compare/adopt. Upstream llama.cpp ships its windows-cuda artifact with Ninja Multi-Config + MSVC, proving the combination works on the same tree.

Test plan

CI is green on this branch (all four Windows build jobs pass, both Java test jobs pass)
Existing MSVC Windows builds remain unchanged and unaffected
New Ninja jobs run C++ unit tests (ctest) and Java model-backed tests
Docs updated: README.md classifier table, CLAUDE.md "Windows Ninja artifact" section, TODO.md status

Related issues / PRs

Closes the Windows compiler cache task in TODO.md (deferred → shipped as dual-artifact design).

Checklist

I have read CONTRIBUTING.md and CODE_OF_CONDUCT.md
My commits follow Conventional Commits
No security-sensitive changes

https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq

…dows The two Windows native build jobs are the only uncached native builds: the Visual Studio generator ignores CMAKE_{C,CXX}_COMPILER_LAUNCHER, so sccache cannot be wired into them. Upstream llama.cpp ships its windows-cuda artifact with Ninja Multi-Config + MSVC, proving the combination works on the same tree. - build.bat: add an sccache probe guard mirroring build.sh's sccache_can_wrap_compiler() — when USE_CACHE=true and sccache is on PATH, compile a trivial TU through `sccache cl.exe`; only on success pass -DCMAKE_{C,CXX}_COMPILER_LAUNCHER=sccache and print `sccache --show-stats`. A missing/crashing sccache falls back to a green uncached build. Inert for the VS generator jobs (they do not set USE_CACHE). - publish.yml: add two evaluation-only jobs, build-windows-x86_64-ninja and build-windows-x86-ninja (windows-2025-vs2026, ilammy/msvc-dev-cmd@v1, sccache v0.16.0, Depot WebDAV env, `build.bat -G "Ninja Multi-Config"`). Artifacts are named Windows-*-ninja (NOT *-libraries) so the package job's `pattern: "*-libraries"` does not consume them; package's `needs:` is unchanged. They run in parallel with the trusted VS jobs. - TODO.md: flip the Windows cache section from "implementation notes" to "evaluation in progress"; document what remains (confirm cache hits, then rename artifacts to *-libraries, wire into package `needs`, retire VS jobs). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq

…ermanent MSVC Builds on the prior Ninja evaluation jobs. Per owner decision, the MSVC / Visual Studio Windows build is the default JAR and is kept permanently (the sccache cache loss on it is accepted); the Ninja Multi-Config build is shipped ALONGSIDE it as a separate classifier JAR, never as a replacement. The two generators produce different jllama.dll files, so they cannot share a resource path in one JAR — hence the classifier (mirrors the cuda / opencl-android pattern). Result: 4 permanent Windows build jobs, both generators distributed and tested end-to-end. - pom.xml: add `windows-ninja` profile producing a <classifier>ninja-windows</classifier> JAR from ${outputDirectory}_windows_ninja (separate compile pass + resource copy + classified jar; mirrors cuda / opencl-android). - publish.yml: the package, publish-snapshot, and publish-release jobs download Windows-{x86_64,x86}-ninja into src/main/resources_windows_ninja/ and activate the `windows-ninja` profile (-P ...,windows-ninja). Add a test-java-windows-x86_64-ninja job that loads the Ninja DLL via JNI and runs the full model-backed suite (parity with test-java-windows-x86_64). Wire the Ninja build + Java-test jobs into the package `needs:` graph. - .gitignore: ignore src/main/resources_windows_ninja/ (CI-staged, never committed). - README.md: add the `ninja-windows` classifier row + dependency snippet. - CLAUDE.md: add "Windows Ninja artifact" section; refresh the sccache "Windows" note (no CMakeLists change — routing is CI-download + pom-profile, not a GGML flag). - TODO.md: rewrite the Windows section to the final dual-build design (MSVC kept forever; remaining work is cache-hit verification, not a redesign). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq

…fix) The b9739 upgrade (#247) added tools/server/server-schema.cpp to the jllama library target but not to the jllama_test executable target. server-schema.cpp defines server_schema::eval_llama_cmpl_schema(), which test_server.cpp and the upstream server-context.cpp reference, so the test link failed with an undefined/unresolved external symbol on every platform that builds the tests (Linux x86_64, macOS, Windows MSVC, Windows Ninja). The shared library itself links fine — only jllama_test was missing the TU. Add server-schema.cpp to the jllama_test source list, mirroring its presence in the library target. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq

…behavior b9739's server-schema validation changed two behaviors that the existing tests encoded from an earlier llama.cpp: - negative n_discard is now range-checked (0 <= value <= INT32_MAX) and throws, instead of being silently clamped to 0; and - every field-validation failure is wrapped in std::invalid_argument ("Field '<name>': ...", server-schema.cpp) rather than surfacing the inner std::runtime_error. Update the four affected assertions (NDiscard negative now expects a throw; dry_sequence_breakers / lora / repeat_last_n expect std::invalid_argument) and the descriptive comment block. With the prior server-schema.cpp test-link fix, the full C++ suite passes locally (446/446). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq

…neW) regression Root-cause write-up only (fix deferred per discussion). On Windows x86_64, b9739's common_params_parse has a _WIN32 prologue that replaces the caller-supplied argv with GetCommandLineW() (the host process command line). Correct for the standalone llama-server.exe, but under JNI the process is java.exe (no --model on its command line), so the parse fails with "Failed to parse model parameters" for every model-loading Java test. Affects both the MSVC and Ninja DLLs; Linux/macOS unaffected. Documents symptom, exact upstream code (arg.cpp:924-931), why our argv is platform-neutral, and three fix options to choose from later. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq

…cross LTS) The dockcross linux-arm64-lts image ships GCC 8.5 / glibc 2.17, which can no longer compile llama.cpp b9739 (its C++17 CTAD-in-`new` needs GCC >= 12). Mirror upstream llama.cpp's own aarch64 release job: build natively on GitHub's free ubuntu-24.04-arm runner with GCC 14. - publish.yml: rewrite crosscompile-linux-aarch64 as a native build (setup-java -> mvn compile -> build.sh) with -DGGML_NATIVE=OFF for ARMv8 portability, and run ctest on real ARM hardware for the first time. Job id kept (downstream needs:); display name -> "Build and Test Linux aarch64". - build.sh: generalize the Linux sccache auto-fetch from x86_64-only to map uname -m -> x86_64/aarch64 static-musl release (x86_64 path unchanged). - CLAUDE.md: document the native build and the glibc floor change (2.17 -> ~2.39, matching upstream's ARM binaries). Trade-off: the aarch64 artifact's glibc floor rises to ~2.39 (same envelope upstream's own ARM binaries require); x86_64 stays at glibc 2.17. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq

The default CPU JAR's Linux aarch64 native is now built natively on ubuntu-24.04-arm (mirroring upstream), so it requires glibc >= 2.39. Document this in the "Choosing the right classifier" runtime-requirement column so users on older-glibc ARM hosts know to expect a load failure. Linux x86-64 stays at glibc 2.17 (manylinux2014). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq

…#24779) Introduce a generic source-patch mechanism for the FetchContent'd llama.cpp tree so fixes apply to every C++ build (all CI jobs + local) from one place: - patches/ : drop *.patch / *.diff here (applied in filename order) - cmake/apply-llama-patches.cmake : cross-platform (cmake -P), idempotent (git apply --reverse --check skips already-applied), fail-loud (a stale patch aborts configure so it can't be silently dropped from a release build) - CMakeLists.txt : wired as the llama.cpp FetchContent PATCH_COMMAND First patch, 0001-win32-arg-parse-embed-guard.patch, fixes the b9739 Windows JNI regression from upstream #24779: common_params_parse unconditionally replaced the caller's argv with the process command line (GetCommandLineW), so an embedded java.exe lost its --model args -> "Failed to parse model parameters". The guard adopts the process command line only when the re-derived arg count equals argc: true for the standalone llama-* tools (UTF-8 CLI fix preserved), false for a JVM host (our already-UTF-8 argv kept). Verified the patch applies cleanly to b9739 and the applier is idempotent. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq

…make The REUSE license-compliance check (fsfe/reuse-action) failed on the new files: - cmake/apply-llama-patches.cmake had the license but no copyright -> add inline SPDX-FileCopyrightText (mirrors build.sh). - patches/*.patch is unified-diff format and cannot carry an inline header without corrupting the diff -> annotate via REUSE.toml with a patches/** glob, so every current and future patch in the directory is covered automatically. Verified locally: `reuse lint` -> compliant with REUSE 3.3. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq

The count-guard variant of 0001-win32-arg-parse-embed-guard.patch fixed 21/25 Windows Java tests but collided on the 4 server-integration setups (Rerank, ToolCalling, Multimodal, OpenAiCompatServer) whose argv length happened to equal java.exe's process arg count — the guard then wrongly adopted java.exe's command line and they kept failing with "Failed to parse model parameters". Those tests pass on Linux/macOS, so their args are valid; the collision was the only cause. Switch to the deterministic fix: keep the make_utf8_argv() call referenced (no -Wunused-function) but never adopt its result, so the caller's already-UTF-8 JNI argv is always used. A JNI library is never the process, so the GetCommandLineW override is pure liability for us. Verified the patch applies cleanly to b9739 and the applier stays idempotent. CLAUDE.md + TODO.md updated to record the count-guard -> removal change; upstream PR can still expose an opt-out that preserves the standalone tools' UTF-8 fix. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq

… Rating) SonarCloud's GitHub Actions analyzer flagged these (they drive the "Security Rating on New Code = C" gate — the "C" is the rating grade, not the C language; the scan is the Maven scanner over YAML/Java, not CFamily): - clang-format.yml: pip install without --only-binary :all: can execute a source package's setup.py at install time. Force wheels-only (clang-format ships manylinux wheels). This is the only finding in the new-code window, so it is what fails the gate. Uses a block scalar so ":all:" doesn't break YAML. - osv-scanner.yml / scorecard.yml: tighten top-level "permissions: read-all" to "contents: read". Safe — every job in both files already declares its own exact permissions (security-events: write, id-token: write, etc.), which fully override the workflow default. Not touched: publish.yml's workflow-level "contents: read" (Sonar wants it at job level) — Low severity, not in the new-code window (doesn't affect the gate), already owner-assigned, and spreading it across ~25 jobs is invasive on the release pipeline. Left for a separate decision. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq

SonarCloud flagged three Critical Reliability bugs in PairTest: assertNotNull on pair.hashCode(), which returns a primitive int that can never be null, so the assertions verified nothing. Replace them with a deterministic-hashCode check (capture the value, assert a repeated call returns the same int) — this exercises the real "doesn't throw + consistent across calls" intent. A local variable is used so the two assertEquals arguments are distinct expressions (avoids the identical-argument rule java:S5863). These are Reliability bugs, separate from the failing Security-Rating gate; fixing them improves the Reliability rating. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq

- Add open items: SonarCloud "Security Rating on New Code" gate status (GHA findings fixed in 84297e0, publish.yml owner-accepted, PairTest reliability bugs fixed in 9f0d377, gate pending the last dashboard finding); the upstream llama.cpp PR to drop the local Windows arg-parse patch; the aarch64 branch- protection rename. - Mark resolved: Windows-JNI arg-parse regression (patch option 2 chosen); Windows Ninja verification (green + sccache hits). - Add Done section for b9739 upgrade, Windows Ninja classifier, native ubuntu-24.04-arm aarch64 build, the generic patches/ mechanism, and CUDA sccache verification. - Fix stale b9682 -> b9739 references (pin is b9739). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq

The dd264b2 scaffold (NativeServer.java + NativeServerSmokeTest) landed the native-HTTP-transport Java entry point, but TODO.md tracked no follow-up for wiring the upstream server.cpp route table to JNI. Add it under the server follow-ups so the remaining native-server work is visible. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq

…248 item The combined commit status on #248 shows a "License Compliance" check failing with "17 issues found" — a dependency-license scanner app, separate from REUSE (green) and SonarCloud. Document it as open, note it is almost certainly pre-existing (the PR changes no dependencies), and how to triage it. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01SfvSZ76NW4e1qX1PjL4RKq

sonarqubecloud · 2026-06-20T23:03:50Z

Quality Gate failed

Failed conditions
C Security Rating on New Code (required ≥ A)

See analysis details on SonarQube Cloud

Catch issues before they fail your Quality Gate with our IDE extension SonarQube for IDE

claude added 2 commits June 20, 2026 16:56

bernardladenthin temporarily deployed to startgate June 20, 2026 17:37 — with GitHub Actions Inactive

bernardladenthin temporarily deployed to startgate June 20, 2026 17:53 — with GitHub Actions Inactive

bernardladenthin temporarily deployed to startgate June 20, 2026 17:59 — with GitHub Actions Inactive

bernardladenthin temporarily deployed to startgate June 20, 2026 18:59 — with GitHub Actions Inactive

bernardladenthin temporarily deployed to startgate June 20, 2026 19:45 — with GitHub Actions Inactive

bernardladenthin temporarily deployed to startgate June 20, 2026 19:47 — with GitHub Actions Inactive

bernardladenthin temporarily deployed to startgate June 20, 2026 20:03 — with GitHub Actions Inactive

bernardladenthin temporarily deployed to startgate June 20, 2026 20:05 — with GitHub Actions Inactive

bernardladenthin temporarily deployed to startgate June 20, 2026 20:34 — with GitHub Actions Inactive

bernardladenthin temporarily deployed to startgate June 20, 2026 21:35 — with GitHub Actions Inactive

bernardladenthin temporarily deployed to startgate June 20, 2026 21:56 — with GitHub Actions Inactive

claude added 3 commits June 20, 2026 22:03

bernardladenthin temporarily deployed to startgate June 20, 2026 23:01 — with GitHub Actions Inactive

bernardladenthin merged commit 2b0fcf5 into main Jun 20, 2026
11 of 12 checks passed

bernardladenthin deleted the claude/exciting-fermi-ltkrku branch June 20, 2026 23:02

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add Windows Ninja Multi-Config build with sccache caching#248

Add Windows Ninja Multi-Config build with sccache caching#248
bernardladenthin merged 15 commits into
mainfrom
claude/exciting-fermi-ltkrku

bernardladenthin commented Jun 20, 2026

Uh oh!

Uh oh!

sonarqubecloud Bot commented Jun 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

bernardladenthin commented Jun 20, 2026

Summary

Rationale

Test plan

Related issues / PRs

Checklist

Uh oh!

Uh oh!

sonarqubecloud Bot commented Jun 20, 2026

Quality Gate failed

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants